Generalized Ewens–Pitman model for Bayesian clustering
Abstract
We propose a Bayesian method for clustering from discrete data structures that commonly arise in genetics and other applications. This method is equivariant with respect to relabelling units; unsampled units do not interfere with sampled data; and missing data do not hinder inference. Cluster inference using the posterior mode performs well on simulated and real datasets, and the posterior predictive distribution enables supervised learning based on a partial clustering of the sample.
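For orientation only, the sketch below illustrates the standard two-parameter Ewens–Pitman partition distribution, not the generalized model proposed here, whose details the abstract does not give. It samples a partition sequentially (the Chinese restaurant scheme) and evaluates the log-probability of a partition from its block sizes. The names `sample_partition`, `log_eppf`, `alpha` (discount), and `theta` (concentration) are assumptions introduced for illustration.

```python
import math
import random


def _check(alpha, theta):
    # Standard parameter range assumed here: 0 <= alpha < 1 and theta > -alpha.
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")


def sample_partition(n, alpha, theta, rng):
    """Draw cluster labels for n units from the two-parameter scheme."""
    _check(alpha, theta)
    if n <= 0:
        return []
    labels, sizes = [0], [1]  # the first unit always opens the first block
    for _ in range(1, n):
        k = len(sizes)
        # Existing block j has weight size_j - alpha;
        # a new block has weight theta + k * alpha.
        weights = [s - alpha for s in sizes] + [theta + k * alpha]
        j = rng.choices(range(k + 1), weights=weights)[0]
        if j == k:
            sizes.append(1)
        else:
            sizes[j] += 1
        labels.append(j)
    return labels


def log_eppf(sizes, alpha, theta):
    """Log-probability of one set partition with the given block sizes."""
    _check(alpha, theta)
    n, k = sum(sizes), len(sizes)
    out = sum(math.log(theta + i * alpha) for i in range(1, k))
    out -= sum(math.log(theta + i) for i in range(1, n))
    for s in sizes:
        out += sum(math.log(i - alpha) for i in range(1, s))
    return out


labels = sample_partition(20, alpha=0.25, theta=1.0, rng=random.Random(0))
```

The probability depends on the partition only through its block sizes, which is the relabelling invariance that the abstract refers to as equivariance under relabelling of units.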
Similar resources
A Markov random field-regulated Pitman-Yor process prior for spatially constrained data clustering
In this work, we propose a Markov random field-regulated Pitman–Yor process (MRF-PYP) prior for nonparametric clustering of data with spatial interdependencies. The MRF-PYP is constructed by imposing a Pitman–Yor process over the distribution of the latent variables that allocate data points to clusters (model states), the discount hyperparameter of which is regulated by an additionally postula...
Consistency in Latent Allocation Models
A probabilistic formulation for latent allocation models was introduced in the machine learning literature by Blei et al. (2003) in the study of a corpus of documents. This article addresses the consistency properties of various posterior probabilities on the space of latent allocations, focusing on the “bag of words” model. It is shown that the Latent Dirichlet Allocation and Ewens-Pitman pri...
Regeneration in random combinatorial structures
The theory of Kingman’s partition structures has two culminating points: the general paintbox representation, relating finite partitions to hypothetical infinite populations via a natural sampling procedure, and a central example of the theory, the Ewens-Pitman two-parameter partitions. In these notes we further develop the theory by passing to structures enriched by the order on the collection of...
Long-run Behavior of Macroeconomic Models with Heterogeneous Agents: Asymptotic Behavior of One- and Two-Parameter Poisson-Dirichlet Distributions
This paper discusses the asymptotic behavior of one- and two-parameter Poisson-Dirichlet models, that is, Ewens models and their two-parameter extensions by Pitman, and shows that their asymptotic behaviors are very different. The paper shows that the asymptotic properties of a class of one- and two-parameter Poisson-Dirichlet distribution models are drastically different. Convergence behavior is expressed in terms o...
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. The beta distribution is a flexible family of densities on the interval (0, 1) that covers symm...